A geometric alternative to Nesterov's accelerated gradient descent
Authors
Sébastien Bubeck, Yin Tat Lee, Mohit Singh
Abstract
We propose a new method for unconstrained optimization of a smooth and strongly convex function, which attains the optimal rate of convergence of Nesterov’s accelerated gradient descent. The new algorithm has a simple geometric interpretation, loosely inspired by the ellipsoid method. We provide some numerical evidence that the new method can be superior to Nesterov’s accelerated gradient descent.
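For context, here is a minimal sketch of Nesterov's accelerated gradient descent for a beta-smooth, alpha-strongly convex function, the baseline whose optimal rate the proposed geometric method matches. The quadratic test problem, constant-momentum form, and step sizes below are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

# Illustrative test problem (an assumption, not from the paper):
# f(x) = 0.5 * x^T A x with eigenvalues in [alpha, beta], so f is
# alpha-strongly convex and beta-smooth.
alpha, beta = 1.0, 100.0
n = 50
A = np.diag(np.linspace(alpha, beta, n))

def grad(x):
    return A @ x

def nesterov_agd(x0, n_iters=300):
    """Constant-momentum AGD for strongly convex functions (textbook form)."""
    kappa = beta / alpha                      # condition number
    momentum = (np.sqrt(kappa) - 1.0) / (np.sqrt(kappa) + 1.0)
    x, y = x0.copy(), x0.copy()
    for _ in range(n_iters):
        y_next = x - grad(x) / beta           # gradient step
        x = y_next + momentum * (y_next - y)  # momentum extrapolation
        y = y_next
    return y

print(np.linalg.norm(nesterov_agd(np.ones(n))))  # near 0, the minimizer
```

This scheme converges in O(sqrt(beta/alpha) log(1/epsilon)) iterations, which is the optimal rate the abstract refers to.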
Similar articles
A Variational Perspective on Accelerated Methods in Optimization
Accelerated gradient methods play a central role in optimization, achieving optimal rates in many settings. Although many generalizations and extensions of Nesterov's original acceleration method have been proposed, it is not yet clear what the natural scope of the acceleration concept is. In this paper, we study accelerated methods from a continuous-time perspective. We show that there is a La...
Acceleration and Averaging in Stochastic Descent Dynamics
[1] A. Nemirovski and D. Yudin. Problem Complexity and Method Efficiency in Optimization. Wiley-Interscience Series in Discrete Mathematics. Wiley, 1983. [2] W. Krichene, A. Bayen, and P. Bartlett. Accelerated Mirror Descent in Continuous and Discrete Time. NIPS 2015. [3] W. Su, S. Boyd, and E. Candes. A differential equation for modeling Nesterov's accelerated gradient method: theory and insights. NI...
Accelerated Gradient Descent Escapes Saddle Points Faster than Gradient Descent
Nesterov's accelerated gradient descent (AGD), an instance of the general family of "momentum methods", provably achieves a faster convergence rate than gradient descent (GD) in the convex setting. However, whether these methods are superior to GD in the nonconvex setting remains open. This paper studies a simple variant of AGD, and shows that it escapes saddle points and finds a second-order stat...
Optimal Algorithms for Distributed Optimization
In this paper, we study the optimal convergence rate for distributed convex optimization problems in networks. We model the communication restrictions imposed by the network as a set of affine constraints and provide optimal complexity bounds for four different setups, namely: the function $F(x) \triangleq \sum_{i=1}^{m} f_i(x)$ is strongly convex and smooth, either strongly convex or smooth...
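To make the problem setup concrete, here is a toy sketch of minimizing F(x) = sum_i f_i(x) over a ring network with plain decentralized gradient descent. The quadratic local functions, mixing matrix, and step size are assumptions for illustration; this is not one of the optimal algorithms the paper derives.

```python
import numpy as np

# Toy instance of F(x) = sum_i f_i(x) with f_i(x) = 0.5 * ||x - b_i||^2,
# whose minimizer is the average of the b_i (an illustrative assumption).
m, d = 5, 3
rng = np.random.default_rng(0)
b = rng.normal(size=(m, d))

# Doubly stochastic mixing matrix for a ring network of m nodes.
W = np.zeros((m, m))
for i in range(m):
    W[i, i] = 0.5
    W[i, (i - 1) % m] = 0.25
    W[i, (i + 1) % m] = 0.25

X = np.zeros((m, d))           # one local iterate per node
step = 0.01
for _ in range(2000):
    grads = X - b              # local gradients of the f_i
    X = W @ X - step * grads   # consensus averaging plus a gradient step

# Constant-step decentralized GD only reaches an O(step) neighborhood
# of the optimum; the deviation printed here is small but nonzero.
print(np.abs(X - b.mean(axis=0)).max())
```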
A Differential Equation for Modeling Nesterov's Accelerated Gradient Method: Theory and Insights
We derive a second-order ordinary differential equation (ODE), which is the limit of Nesterov's accelerated gradient method. This ODE exhibits approximate equivalence to Nesterov's scheme and thus can serve as a tool for analysis. We show that the continuous time ODE allows for a better understanding of Nesterov's scheme. As a byproduct, we obtain a family of schemes with similar ...
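The ODE derived in this paper is X''(t) + (3/t) X'(t) + grad f(X(t)) = 0 with X(0) = x0 and X'(0) = 0. A crude semi-implicit Euler integration (the step size and objective below are illustrative assumptions) shows the trajectory settling at the minimizer:

```python
import numpy as np

def grad_f(x):
    # Illustrative smooth convex objective f(x) = 0.5 * ||x||^2 (an assumption).
    return x

# Semi-implicit Euler integration of  X'' + (3/t) X' + grad f(X) = 0,
# the continuous-time limit of Nesterov's scheme derived in the paper.
h = 0.01
x = np.ones(10)        # X(0) = x0
v = np.zeros_like(x)   # X'(0) = 0
t = h                  # start just after 0 to avoid division by zero
for _ in range(10_000):
    v += h * (-(3.0 / t) * v - grad_f(x))  # velocity update
    x += h * v                             # position update
    t += h

print(np.linalg.norm(x))  # decays toward 0, the minimizer
```

Along this ODE, f(X(t)) - f(x*) decays at the O(1/t^2) rate matching Nesterov's discrete scheme.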
Journal: CoRR
Volume: abs/1506.08187
Published: 2015